27 research outputs found

    Automated machine learning optimizes and accelerates predictive modeling from COVID-19 high throughput datasets

    The COVID-19 outbreak has placed intense pressure on healthcare systems, creating an urgent demand for effective diagnostic, prognostic and therapeutic procedures. Here, we employed Automated Machine Learning (AutoML) to analyze three publicly available high-throughput COVID-19 datasets, including proteomic, metabolomic and transcriptomic measurements. Pathway analysis of the selected features was also performed. Analysis of a combined proteomic and metabolomic dataset led to 10 equivalent signatures of two features each, with AUC 0.840 (CI 0.723–0.941) in discriminating severe from non-severe COVID-19 patients. A transcriptomic dataset led to two equivalent signatures of eight features each, with AUC 0.914 (CI 0.865–0.955) in distinguishing COVID-19 patients from those with a different acute respiratory illness. Another transcriptomic dataset led to two equivalent signatures of nine features each, with AUC 0.967 (CI 0.899–0.996) in distinguishing COVID-19 patients from virus-free individuals. Signature predictive performance remained high upon validation. Multiple new features emerged, and pathway analysis revealed biological relevance through their implication in the Viral mRNA Translation, Interferon gamma signaling and Innate Immune System pathways. In conclusion, AutoML analysis led to multiple biosignatures of high predictive performance, with few features and a large choice of alternative predictors. These favorable characteristics are well suited to the development of cost-effective assays that contribute to better disease management.
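The AUC values with confidence intervals quoted in this abstract can be illustrated with a minimal sketch. The abstract does not say how its intervals were obtained; the percentile bootstrap used here is only one common option, and the scores, function names and parameters below are hypothetical, not taken from the study.

```python
import random

def auc(scores_pos, scores_neg):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))

def bootstrap_ci(scores_pos, scores_neg, n_boot=1000, level=0.95, seed=0):
    """Percentile bootstrap interval for the AUC (assumed method, resampling each class)."""
    rng = random.Random(seed)
    stats = []
    for _ in range(n_boot):
        pos = rng.choices(scores_pos, k=len(scores_pos))
        neg = rng.choices(scores_neg, k=len(scores_neg))
        stats.append(auc(pos, neg))
    stats.sort()
    lo_idx = int((1 - level) / 2 * n_boot)
    hi_idx = int((1 + level) / 2 * n_boot) - 1
    return stats[lo_idx], stats[hi_idx]

# Hypothetical classifier scores for illustration only.
severe = [0.9, 0.8, 0.75, 0.6, 0.55, 0.4]
non_severe = [0.7, 0.5, 0.45, 0.3, 0.2, 0.1]

point = auc(severe, non_severe)
low, high = bootstrap_ci(severe, non_severe)
print(f"AUC {point:.3f} (CI {low:.3f}-{high:.3f})")
```

With so few samples the interval is wide, which is consistent with the broader interval the abstract reports for its smallest signature.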

    Machine learning approaches in microbiome research: challenges and best practices

    Predictive analysis of microbiome data within a machine learning (ML) workflow presents numerous domain-specific challenges involving preprocessing, feature selection, predictive modeling, performance estimation, model interpretation, and the extraction of biological information from the results. To assist decision-making, we offer a set of recommendations on algorithm selection, pipeline creation and evaluation, stemming from the COST Action ML4Microbiome. We compared the suggested approaches on a multi-cohort shotgun metagenomics dataset of colorectal cancer patients, focusing on their performance in disease diagnosis and biomarker discovery. We demonstrate that the use of compositional transformations and filtering methods as part of data preprocessing does not always improve the predictive performance of a model. In contrast, multivariate feature selection, such as the Statistically Equivalent Signatures algorithm, was effective in reducing the classification error. When validated on a separate test dataset, this algorithm in combination with random forest modeling provided the most accurate performance estimates. Lastly, we showed how linear modeling by logistic regression coupled with visualization techniques such as Individual Conditional Expectation (ICE) plots can yield interpretable results and offer biological insights. These findings are significant for clinicians and non-experts alike in translational applications.
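The idea behind the ICE plots mentioned in this abstract can be sketched as follows: for each sample, one feature is swept over a grid while the others are held fixed, and the model prediction is recorded at each grid value. The logistic regression coefficients, samples and function names below are hypothetical and are not taken from the paper.

```python
import math

def predict_proba(x, weights, bias):
    """Logistic regression probability for one sample."""
    z = bias + sum(w * v for w, v in zip(weights, x))
    return 1.0 / (1.0 + math.exp(-z))

def ice_curves(samples, feature_idx, grid, weights, bias):
    """One curve per sample: predictions as one feature sweeps the grid, others held fixed."""
    curves = []
    for x in samples:
        curve = []
        for value in grid:
            x_mod = list(x)               # copy so the original sample is untouched
            x_mod[feature_idx] = value    # overwrite only the feature of interest
            curve.append(predict_proba(x_mod, weights, bias))
        curves.append(curve)
    return curves

# Hypothetical model and data for illustration only.
weights = [1.5, -0.8]
bias = -0.2
samples = [[0.1, 0.4], [0.9, 0.2], [0.5, 1.0]]
grid = [0.0, 0.5, 1.0]

curves = ice_curves(samples, feature_idx=0, grid=grid, weights=weights, bias=bias)
for sample, curve in zip(samples, curves):
    print(sample, [round(p, 3) for p in curve])
```

For a logistic model the curves are parallel on the logit scale and differ only by each sample's contribution from the fixed features, which is part of what makes the linear case easy to read.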

    Applications of Machine Learning in Human Microbiome Studies: A Review on Feature Selection, Biomarker Identification, Disease Prediction and Treatment

    The growing number of microbiome-related studies has notably increased the availability of data on human microbiome composition and function. These studies provide the essential material to explore host-microbiome associations in depth, along with their relation to the development and progression of various complex diseases. Improved data-analytical tools are needed to exploit all information from these biological datasets, taking into account the peculiarities of microbiome data, i.e., the compositional, heterogeneous and sparse nature of these datasets. The possibility of predicting host phenotypes based on taxonomy-informed feature selection, in order to establish an association between the microbiome and disease states, is beneficial for personalized medicine. In this regard, machine learning (ML) provides new insights into the development of models that can be used to predict outputs, such as classification and prediction in microbiology, infer host phenotypes to predict diseases, and use microbial communities to stratify patients by their characterization of state-specific microbial signatures. Here we review the state-of-the-art ML methods and respective software applied in human microbiome studies, performed as part of the COST Action ML4Microbiome activities. This scoping review focuses on the application of ML in microbiome studies related to association and clinical use for diagnostics, prognostics, and therapeutics. Although the data presented here are more related to the bacterial community, many algorithms could be applied in general, regardless of the feature type. This literature and software review covering this broad topic is aligned with the scoping review methodology. The manual identification of data sources has been complemented with: (1) an automated publication search through the digital libraries of the three major publishers using a natural language processing (NLP) toolkit, and (2) an automated identification of relevant software repositories on GitHub and ranking of the related research papers relying on a learning-to-rank approach.

    Advancing microbiome research with machine learning: key findings from the ML4Microbiome COST action

    The rapid development of machine learning (ML) techniques has opened up the data-dense field of microbiome research for novel therapeutic, diagnostic, and prognostic applications targeting a wide range of disorders, which could substantially improve healthcare practices in the era of precision medicine. However, several challenges must be addressed to fully exploit the benefits of ML in this field. In particular, there is a need to establish "gold standard" protocols for conducting ML analysis experiments and to improve interactions between microbiome researchers and ML experts. The Machine Learning Techniques in Human Microbiome Studies (ML4Microbiome) COST Action CA18131 is a European network established in 2019 to promote collaboration between discovery-oriented microbiome researchers and data-driven ML experts to optimize and standardize ML approaches for microbiome analysis. This perspective paper presents the key achievements of ML4Microbiome, which include identifying predictive and discriminatory 'omics' features, improving repeatability and comparability, developing automation procedures, and defining priority areas for the novel development of ML methods targeting the microbiome. The insights gained from ML4Microbiome will help to maximize the potential of ML in microbiome research and pave the way for new and improved healthcare practices.

    In vivo molecular imaging of epithelial pre-cancers based on dynamic optical scattering modeling

    We present a novel biophotonic method and imaging modality for estimating and mapping neoplasia-specific functional and structural parameters of the cervical precancerous epithelium. Estimations were based on experimental data obtained from contrast-enhanced optical imaging of the cervix in vivo. The dynamic characteristics of the measured optical signal are governed by the epithelial transport effects of the biomarker. A compartmental pharmacokinetic model of the cervical neoplastic epithelium has been developed. Nine structural and functional parameters have been identified as potentially correlated with neoplasia growth and as manifested in the measured data in a convoluted manner. We performed Global Sensitivity Analysis to identify the key determinants of the model's output. We have shown that it is possible to estimate the following neoplasia-related parameters: number of neoplastic layers, extracellular space dimensions, functionality of tight junctions and extracellular pH. Global optimization techniques showed that the estimations of our method are of adequate accuracy and precision. In particular, the Differential Evolution algorithm converged to these four parameters with an error of roughly 1%. We show that the estimated values are quite consistent with information provided in the literature. Our results are unique in the sense that, for the first time, functional and microstructural parameter maps can be estimated and displayed together, thus maximizing the diagnostic information. The quantity and the quality of this information are unattainable by other invasive and non-invasive methods. The findings of this thesis strongly suggest that our method may become a valuable diagnostic tool that will facilitate the development and evaluation of new cancer therapies.

    GAN-Based Training of Semi-Interpretable Generators for Biological Data Interpolation and Augmentation

    Single-cell measurements carry invaluable information regarding the state of each cell and its underlying regulatory mechanisms. The popularity and use of single-cell measurements are constantly growing. Despite the typically large volume of collected data, the under-representation of important cell (sub-)populations negatively affects downstream analysis and its robustness. Therefore, enriching biological datasets with samples that belong to a rare state or manifold is advantageous overall. In this work, we train families of generative models via the minimization of Rényi divergence, resulting in an adversarial training framework. Apart from the standard neural network-based models, we propose families of semi-interpretable generative models. The proposed models are further tailored to generate realistic gene expression measurements, whose characteristics include zero-inflation and sparsity, without the need for any data pre-processing. Explicit factors of the data such as measurement time, state or cluster are taken into account by our generative models as conditional variables. We train the proposed conditional models, compare them against the state of the art on a range of synthetic and real datasets, and demonstrate their ability to accurately perform data interpolation and augmentation.
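For orientation, the Rényi divergence named in this abstract can be computed directly for discrete distributions. This is a minimal sketch using the standard textbook definition, D_alpha(P || Q) = log(sum_i p_i^alpha q_i^(1 - alpha)) / (alpha - 1); the abstract does not give the exact parameterization or the variational form used for adversarial training, so the function and example distributions below are assumptions for illustration.

```python
import math

def renyi_divergence(p, q, alpha):
    """Rényi divergence D_alpha(P || Q) for discrete distributions.

    Assumes alpha > 0, alpha != 1, and q_i > 0 wherever p_i > 0.
    """
    if alpha <= 0 or alpha == 1:
        raise ValueError("alpha must be positive and different from 1")
    total = sum((pi ** alpha) * (qi ** (1 - alpha)) for pi, qi in zip(p, q) if pi > 0)
    return math.log(total) / (alpha - 1)

def kl_divergence(p, q):
    """Kullback-Leibler divergence, the limit of the Rényi divergence as alpha -> 1."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Hypothetical distributions for illustration only.
p = [0.5, 0.3, 0.2]
q = [0.4, 0.4, 0.2]

for alpha in (0.5, 1.0001, 2.0):
    print(alpha, renyi_divergence(p, q, alpha))
print("KL", kl_divergence(p, q))
```

The order alpha controls how strongly regions where the two distributions disagree are penalized, and values close to 1 recover the familiar Kullback-Leibler divergence.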

    In vivo molecular imaging of epithelial pre-cancers based on dynamic optical scattering modeling

    A dissertation submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy at the ECE School. Summary: We present a novel biophotonic method and imaging modality for estimating and mapping neoplasia-specific functional and structural parameters of the cervical precancerous epithelium. Estimations were based on experimental data obtained from dynamic contrast-enhanced optical imaging of the cervix in vivo. The dynamic characteristics of the measured optical signal are governed by the epithelial transport effects of the biomarker. A compartmental pharmacokinetic model of the cervical neoplastic epithelium has been developed, which predicts the dynamic optical effects for all possible parameter value combinations. Nine biological parameters, both structural and functional, have been identified as potentially correlated with neoplasia growth and as manifested in the measured data in a convoluted manner. We performed Global Sensitivity Analysis to identify the subset of the input parameters that are the key determinants of the model's output. We have shown for the first time that it is possible to estimate, from dynamic optical data measured in vivo, the following neoplasia-related parameters: number of neoplastic layers, extracellular space dimensions, functionality of tight junctions and extracellular pH. Global optimization techniques showed that the estimations of our method are of adequate accuracy and precision. In particular, the Differential Evolution algorithm converged to the set of the four most identifiable parameters with an error of roughly 1%. We show that the values of the four parameters, estimated at two million pixels, are quite consistent with information provided in the literature. Our results are unique in the sense that, for the first time, functional and microstructural parameter maps can be estimated and displayed together, thus maximizing the diagnostic information. The quantity and the quality of this information are unattainable by other invasive and non-invasive methods. The findings of this thesis strongly suggest that our method can improve our understanding of the mechanisms of neoplasia development and of the physiology of tumor growth and metastasis. As a corollary, it may become a valuable diagnostic tool that will also facilitate the development and evaluation of new cancer therapies.
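The parameter-estimation step described in this abstract can be illustrated with a minimal Differential Evolution sketch. The thesis fits a compartmental pharmacokinetic model that is not specified here, so a two-parameter exponential decay stands in for it as a toy model; the DE/rand/1/bin variant, the control settings (`f`, `cr`, population size) and all names below are assumptions, not the author's configuration.

```python
import math
import random

def differential_evolution(loss, bounds, pop_size=20, generations=150,
                           f=0.7, cr=0.9, seed=0):
    """Minimal DE/rand/1/bin minimizer over box bounds (illustrative, not tuned)."""
    rng = random.Random(seed)
    dim = len(bounds)
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    costs = [loss(ind) for ind in pop]
    for _ in range(generations):
        for i in range(pop_size):
            # Pick three distinct individuals other than the current one.
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            forced = rng.randrange(dim)  # guarantees at least one mutated coordinate
            trial = []
            for d in range(dim):
                if d == forced or rng.random() < cr:
                    value = pop[a][d] + f * (pop[b][d] - pop[c][d])
                    lo, hi = bounds[d]
                    value = min(max(value, lo), hi)  # clip to the search box
                else:
                    value = pop[i][d]
                trial.append(value)
            trial_cost = loss(trial)
            if trial_cost <= costs[i]:  # greedy selection
                pop[i], costs[i] = trial, trial_cost
    best = min(range(pop_size), key=costs.__getitem__)
    return pop[best], costs[best]

# Toy stand-in model: signal(t) = amplitude * exp(-rate * t), with noise-free synthetic data.
true_amplitude, true_rate = 2.0, 0.5
times = [float(t) for t in range(10)]
observed = [true_amplitude * math.exp(-true_rate * t) for t in times]

def sum_squared_error(params):
    amplitude, rate = params
    return sum((amplitude * math.exp(-rate * t) - y) ** 2
               for t, y in zip(times, observed))

best_params, best_cost = differential_evolution(
    sum_squared_error, bounds=[(0.0, 5.0), (0.0, 2.0)])
print(best_params, best_cost)
```

In a per-pixel setting such as the one the abstract describes, a fit like this would be repeated independently for each pixel's measured time course to build a parameter map.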

    Gilthead seabream (Sparus aurata) response to three music stimuli (Mozart-"Eine Kleine Nachtmusik," Anonymous-"Romanza," Bach-"Violin Concerto No. 1") and white noise under recirculating water conditions

    This study presents the response of Sparus aurata to three different musical stimuli, derived from the transmission (4 h per day, 5 days per week) of particular music pieces by Mozart, Romanza and Bach (140 dB(rms) re 1 μPa), compared to the same transmission level of white noise, while the underwater ambient noise in all the experimental tanks was 121 dB(rms) re 1 μPa. Using recirculating sea water facilities, 10 groups (2 for each treatment) of 20 specimens of 11.2 ± 0.02 g (S.E.) were reared for 94 days under 150 ± 10 lx, 12L:12D, and were fed an artificial diet three times per day. Fish body weight showed significant differences after 55 days, while its maximum level was observed from the 69th day until the end of the experiment, with the highest value in the Mozart (M) groups, followed by those of Romanza (R), Bach (B), control (C) and white noise (WN). Specific growth rate (SGR; M = B), percent weight gain (%WG; M = B) and feed conversion ratio (FCR; all groups fed the same % b.w.) were also improved for the M group. Brain neurotransmitter results showed significant differences in dopamine (DA; M > B), 5HIAA (C > B), 5HIAA:5HT (WN > R), DOPAC (M > B), and DOPAC:DA and (DOPAC + HVA):DA (C > M), while no significant differences were observed in 5HT, NA, HVA and HVA:DA. No differences were observed in biometric measurements, protease activity, or % fatty acids of fillet, visceral fat and liver, while differences were observed in carbohydrase activity and in the amount (mg/g w.w.) of some fatty acids in liver, fillet and visceral fat. In conclusion, the present results confirm those previously reported for S. aurata concerning the relaxing influence of the transmission of Mozart music (compared to R and B), attributed to the action of its brain neurotransmitters, which resulted in maximum growth rate and body weight and improved FCR. This conclusion definitely supports the musical "understanding" and sensitivity of S. aurata to music stimuli and also suggests a specific effect of white noise.
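The growth indices named in this abstract (SGR, %WG, FCR) are not defined there; the sketch below uses the definitions commonly applied in aquaculture studies, which is an assumption about what the authors computed. The initial weight (11.2 g) and duration (94 days) come from the abstract, while the final weight and feed amount are hypothetical placeholders.

```python
import math

def specific_growth_rate(w_initial, w_final, days):
    """SGR in % body weight per day: 100 * (ln W_final - ln W_initial) / days."""
    return 100.0 * (math.log(w_final) - math.log(w_initial)) / days

def percent_weight_gain(w_initial, w_final):
    """%WG: weight gained as a percentage of the initial weight."""
    return 100.0 * (w_final - w_initial) / w_initial

def feed_conversion_ratio(feed_given, w_initial, w_final):
    """FCR: mass of feed supplied per unit of weight gained (lower is better)."""
    return feed_given / (w_final - w_initial)

initial_weight_g = 11.2      # from the abstract
rearing_days = 94            # from the abstract
final_weight_g = 30.0        # hypothetical value, not reported in the abstract
feed_per_fish_g = 28.2       # hypothetical value, not reported in the abstract

print("SGR (%/day):", specific_growth_rate(initial_weight_g, final_weight_g, rearing_days))
print("%WG:", percent_weight_gain(initial_weight_g, final_weight_g))
print("FCR:", feed_conversion_ratio(feed_per_fish_g, initial_weight_g, final_weight_g))
```

Because all groups received the same ration as a percentage of body weight, differences in FCR between treatments in such a design reflect differences in weight gained rather than in feed offered.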